Which of the following code blocks will most quickly return an approximation for the number of distinct values in column division in DataFrame storesDF?
Answer : A
The code block shown below contains an error. The code block is intended to return a new DataFrame with the mean of column sqft from DataFrame storesDF in column sqftMean. Identify the error.
Code block:
storesDF.agg(mean("sqft").alias("sqftMean"))
Answer : A
Which of the following operations can be used to return the number of rows in a DataFrame?
Answer : D
Which of the following operations returns a GroupedData object?
Answer : D
Which of the following code blocks returns a collection of summary statistics for all columns in DataFrame storesDF?
Answer : E
Which of the following code blocks fails to return a DataFrame reverse sorted alphabetically based on column division?
Answer : C
Which of the following code blocks returns a 15 percent sample of rows from DataFrame storesDF without replacement?
Answer : E
Which of the following code blocks returns all the rows from DataFrame storesDF?
Answer : B
Which of the following code blocks applies the function assessPerformance() to each row of DataFrame storesDF?
Answer : D
The code block shown below contains an error. The code block is intended to print the schema of DataFrame storesDF. Identify the error.
Code block:
storesDF.printSchema
Answer : E
The code block shown below should create and register a SQL UDF named "ASSESS_PERFORMANCE" using the Python function assessPerformance() and apply it to column customerSatisfaction in table stores. Choose the response that correctly fills in the numbered blanks within the code block to complete this task.
Code block:
spark._1_._2_(_3_, _4_)
spark.sql("SELECT customerSatisfaction, _5_(customerSatisfaction) AS result FROM stores")
Answer : A
The code block shown below contains an error. The code block is intended to create a Python UDF assessPerformanceUDF() using the integer-returning Python function assessPerformance() and apply it to column customerSatisfaction in DataFrame storesDF. Identify the error.
Code block:
assessPerformanceUDF = udf(assessPerformance)
storesDF.withColumn("result", assessPerformanceUDF(col("customerSatisfaction")))
Answer : A
The code block shown below contains an error. The code block is intended to use SQL to return a new DataFrame containing column storeId and column managerName from a table created from DataFrame storesDF. Identify the error.
Code block:
storesDF.createOrReplaceTempView("stores")
storesDF.sql("SELECT storeId, managerName FROM stores")
Answer : B
The code block shown below should create a single-column DataFrame from Python list years which is made up of integers. Choose the response that correctly fills in the numbered blanks within the code block to complete this task.
Code block:
_1_._2_(_3_, _4_)
Answer : D
The code block shown below contains an error. The code block is intended to cache DataFrame storesDF only in Spark’s memory and then return the number of rows in the cached DataFrame. Identify the error.
Code block:
storesDF.cache().count()
Answer : B